Abstract

Bayesian analysis of data from the general linear mixed model is challenging because any 
nontrivial prior leads to an intractable posterior density. However, if a conditionally conjugate 
prior density is adopted, then there is a simple Gibbs sampler that can be employed to explore 
the posterior density. A popular default among the conditionally conjugate priors is an improper 
prior that takes a product form with a flat prior on the regression parameter, and so-called power 
priors on each of the variance components. In this paper, a convergence rate analysis of the
corresponding Gibbs sampler is undertaken. The main result is a simple, easily-checked sufficient
condition for geometric ergodicity of the Gibbs Markov chain. This result is close to the best
possible result in the sense that the sufficient condition is only slightly stronger than what is
required to ensure posterior propriety. The theory developed in this paper is extremely important
from a practical standpoint because it guarantees the existence of central limit theorems that
allow for the computation of valid asymptotic standard errors for the estimates computed using
the Gibbs sampler.

1 Introduction



The general linear mixed model (GLMM) takes the form

$$Y = X\beta + Zu + e ,$$

where $Y$ is an $N \times 1$ data vector, $X$ and $Z$ are known matrices with dimensions $N \times p$ and $N \times q$,
respectively, $\beta$ is an unknown $p \times 1$ regression coefficient, $u$ is a random vector whose elements
represent the various levels of the random factors in the model, and $e \sim \mathrm{N}_N(0, \sigma_e^2 I)$. The random
vectors $e$ and $u$ are assumed to be independent. Suppose there are $r$ random factors in the model.
Then $u$ and $Z$ are partitioned accordingly as $u = (u_1^T \; u_2^T \cdots u_r^T)^T$ and $Z = (Z_1 \; Z_2 \cdots Z_r)$, where
$u_i$ is $q_i \times 1$, $Z_i$ is $N \times q_i$, and $q_1 + \cdots + q_r = q$. Then

$$Zu = \sum_{i=1}^r Z_i u_i ,$$

and it is assumed that $u \sim \mathrm{N}_q(0, D)$, where $D = \bigoplus_{i=1}^r \sigma_{u_i}^2 I_{q_i}$. For background on this model, which
is sometimes called the variance components model, see Searle et al. (1992).



A Bayesian version of the GLMM can be assembled by specifying a prior distribution for the
unknown parameters $\beta$ and $\sigma^2$, where $\sigma^2 = (\sigma_e^2 \; \sigma_{u_1}^2 \cdots \sigma_{u_r}^2)^T$ denotes the vector of variance
components. A popular choice is the proper (conditionally) conjugate prior that takes $\beta$ to be
multivariate normal, and takes each of the variance components to be inverted gamma. In situations
where there is little prior information, the hyperparameters of this proper prior are often set to
extreme values as this is thought to yield a "non-informative" prior. Unfortunately, these extreme
proper priors approximate improper priors that correspond to improper posteriors, and this results
in various forms of instability. This problem has led several authors, including Daniels (1999) and
Gelman (2006), to discourage the use of such extreme proper priors, and to recommend alternative
default priors that are improper, but lead to proper posteriors. Consider, for example, the one-way
random effects model given by

$$Y_{ij} = \beta_0 + \alpha_i + e_{ij} ,$$

where $i = 1, \dots, c$, $j = 1, \dots, n_i$, the $\alpha_i$s are iid $\mathrm{N}(0, \sigma_\alpha^2)$, and the $e_{ij}$s, which are independent
of the $\alpha_i$s, are iid $\mathrm{N}(0, \sigma_e^2)$. This is the so-called "non-centered" version of the one-way model.
It is an important special case of our GLMM. (In the alternative "centered" parameterization, the
parameter $\beta_0$ does not appear in the model equation, but rather as the mean of the $\alpha_i$s.) The standard
diffuse prior for this model, which is among those recommended by Gelman (2006), has density
$1/(\sigma_e^2 \sqrt{\sigma_\alpha^2})$. In fact, many of the improper priors for the GLMM that have been suggested and
studied in the literature take the form of a reciprocal of a product of polynomials in the variance
components. One obvious reason for using such a prior is that the resulting posterior has conditional
densities with standard forms, and this facilitates the use of the Gibbs sampler.
In this paper, we consider the following parametric family of priors for $(\beta, \sigma^2)$:

$$p(\beta, \sigma^2; a, b) = (\sigma_e^2)^{-(a_e+1)} e^{-b_e/\sigma_e^2} \prod_{i=1}^r \Big[ (\sigma_{u_i}^2)^{-(a_i+1)} e^{-b_i/\sigma_{u_i}^2} \Big] , \qquad (1)$$

where $a = (a_e, a_1, \dots, a_r)$ and $b = (b_e, b_1, \dots, b_r)$ are fixed hyperparameters. By taking $b$ to be 0,
we can recover the reciprocal polynomial priors described above. Note that $\beta$ does not appear on
the right-hand side of (1); that is, we are using a so-called flat prior for $\beta$. Consequently, even if all
the elements of $a$ and $b$ are strictly positive, so that every variance component gets a proper prior,
the overall prior remains improper. There have been several studies concerning posterior propriety
in this context, and the most general of these was done by Sun et al. (2001). We state their main
result here so that it can be used in a comparison later in this section. Define $\theta = (\beta^T \; u^T)^T$ and
$W = (X \; Z)$, so that $W\theta = X\beta + Zu$. Let $y$ denote the observed data, and let $\phi_d(x; \mu, \Sigma)$ denote
the $\mathrm{N}_d(\mu, \Sigma)$ density evaluated at the vector $x$. By definition, the posterior density is proper if

$$m(y) := \int_{\mathbb{R}_+^{r+1}} \int_{\mathbb{R}^{p+q}} \pi^*(\theta, \sigma^2 | y) \, d\theta \, d\sigma^2 < \infty ,$$

where

$$\pi^*(\theta, \sigma^2 | y) = \phi_N(y; W\theta, \sigma_e^2 I) \, \phi_q(u; 0, D) \, p(\beta, \sigma^2; a, b) .$$

The following result provides sufficient (and nearly necessary) conditions for propriety.

Theorem 1. [Sun, Tsutakawa & He (2001)] Assume that $\mathrm{rank}(X) = p$ and let $t = \mathrm{rank}\big(Z^T (I - X(X^T X)^{-1} X^T) Z\big)$. If the following four conditions hold, then $m(y) < \infty$.

(A) For each $i \in \{1, 2, \dots, r\}$, one of the following holds:

(A1) $a_i < b_i = 0$; (A2) $b_i > 0$

(B) For each $i \in \{1, 2, \dots, r\}$, $q_i + 2a_i > q - t$

(C) $N + 2a_e > p - 2 \sum_{i=1}^r a_i I_{(-\infty,0)}(a_i)$

(D) $2b_e + \|(I - W(W^T W)^- W^T) y\|^2 > 0$
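
The four conditions of Theorem 1 are simple arithmetic once the dimensions and the rank $t$ are known. The following is a minimal sketch, not part of the original analysis: the function name `check_propriety_conditions` and its argument layout are hypothetical, and both $t$ and the residual sum of squares `sse` $= \|(I - W(W^T W)^- W^T) y\|^2$ are assumed to have been computed elsewhere.

```python
def check_propriety_conditions(N, p, q_sizes, t, a_e, b_e, a, b, sse):
    """Evaluate conditions (A)-(D) of Theorem 1 (hypothetical helper).

    N, p     : dimensions of the design matrix X (N rows, p columns)
    q_sizes  : [q_1, ..., q_r], number of levels of each random factor
    t        : rank of Z'(I - X(X'X)^{-1}X')Z, assumed computed elsewhere
    a_e, b_e : hyperparameters attached to the error variance
    a, b     : lists [a_1, ..., a_r] and [b_1, ..., b_r]
    sse      : squared norm of the least-squares residual of y on W = (X Z)
    """
    q = sum(q_sizes)
    # (A): for every factor, either a_i < b_i = 0 or b_i > 0
    cond_A = all((a_i < 0 and b_i == 0) or b_i > 0 for a_i, b_i in zip(a, b))
    # (B): q_i + 2 a_i > q - t for every factor
    cond_B = all(q_i + 2 * a_i > q - t for q_i, a_i in zip(q_sizes, a))
    # (C): only the negative a_i enter the sum on the right-hand side
    cond_C = N + 2 * a_e > p - 2 * sum(a_i for a_i in a if a_i < 0)
    # (D): 2 b_e plus the residual sum of squares is strictly positive
    cond_D = 2 * b_e + sse > 0
    return {"A": cond_A, "B": cond_B, "C": cond_C, "D": cond_D}


# Balanced one-way layout with c = 5 groups of 3 observations and an intercept:
# here p = 1, q = 5 and t = c - 1 = 4. The prior uses a_e = 0, a_1 = -1/2, b = 0.
result = check_propriety_conditions(
    N=15, p=1, q_sizes=[5], t=4, a_e=0.0, b_e=0.0, a=[-0.5], b=[0.0], sse=3.2
)
```

With only two groups (`q_sizes=[2]`, `t=1`) condition (B) fails in this sketch, since $q_1 + 2a_1 = 1$ is not strictly larger than $q - t = 1$.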

When $m(y) < \infty$, the posterior density is well defined (i.e., proper) and is given by $\pi(\theta, \sigma^2 | y) =
\pi^*(\theta, \sigma^2 | y) / m(y)$. It is well known that $\pi(\theta, \sigma^2 | y)$ is intractable in the sense that posterior
expectations cannot be computed in closed form, nor even by classical Monte Carlo methods. However,
there is a simple two-step Gibbs sampler that can be used to approximate the intractable posterior
expectations. This Gibbs sampler simulates a Markov chain, $\{(\theta_n, \sigma_n^2)\}_{n=0}^\infty$, that lives on
$\mathsf{X} = \mathbb{R}^{p+q} \times \mathbb{R}_+^{r+1}$, and has invariant density $\pi(\theta, \sigma^2 | y)$. If the current state of the chain is $(\theta_n, \sigma_n^2)$,
then the next state, $(\theta_{n+1}, \sigma_{n+1}^2)$, is simulated using the usual two steps. Indeed, we draw $\theta_{n+1}$
from $\pi(\theta | \sigma_n^2, y)$, which is a $(p + q)$-dimensional multivariate normal density, and then we draw
$\sigma_{n+1}^2$ from $\pi(\sigma^2 | \theta_{n+1}, y)$, which is a product of $r + 1$ univariate inverted gamma densities. The
exact forms of these conditional densities are given in Section 2.
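
The two-step update can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function name `gibbs_glmm` and the toy data are hypothetical, the normal step is written in precision form (precision $W^T W/\sigma_e^2 + \mathrm{blockdiag}(0_p, D^{-1})$, which follows from completing the square in $\pi^*$ when $\mathrm{rank}(X) = p$), and the inverted gamma shapes $N/2 + a_e$ and $q_i/2 + a_i$ are assumed to be positive.

```python
import numpy as np


def gibbs_glmm(y, X, Z_blocks, a_e, b_e, a, b, n_iter, rng, sigma2_init=None):
    """Two-step Gibbs sampler: theta | sigma2, y and then sigma2 | theta, y."""
    W = np.hstack([X] + list(Z_blocks))
    N, p = X.shape
    q_sizes = [Z_i.shape[1] for Z_i in Z_blocks]
    r = len(q_sizes)
    dim = W.shape[1]
    sigma2 = np.ones(r + 1) if sigma2_init is None else np.array(sigma2_init, dtype=float)
    WtW, Wty = W.T @ W, W.T @ y
    theta_draws = np.empty((n_iter, dim))
    sigma2_draws = np.empty((n_iter, r + 1))
    for n in range(n_iter):
        # Step 1: theta | sigma2, y is normal with precision
        # W'W / sigma2_e + blockdiag(0_p, D^{-1}); the flat prior on beta adds nothing.
        prior_prec = np.concatenate(
            [np.zeros(p)] + [np.full(q_i, 1.0 / s2) for q_i, s2 in zip(q_sizes, sigma2[1:])]
        )
        precision = WtW / sigma2[0] + np.diag(prior_prec)
        chol = np.linalg.cholesky(precision)  # precision = chol @ chol.T
        mean = np.linalg.solve(precision, Wty / sigma2[0])
        theta = mean + np.linalg.solve(chol.T, rng.standard_normal(dim))
        # Step 2: each variance component is inverted gamma given theta.
        # If G ~ Gamma(shape, 1), then scale / G ~ IG(shape, scale).
        resid = y - W @ theta
        sigma2[0] = (b_e + resid @ resid / 2.0) / rng.gamma(N / 2.0 + a_e)
        start = p
        for i, q_i in enumerate(q_sizes):
            u_i = theta[start:start + q_i]
            sigma2[i + 1] = (b[i] + u_i @ u_i / 2.0) / rng.gamma(q_i / 2.0 + a[i])
            start += q_i
        theta_draws[n], sigma2_draws[n] = theta, sigma2
    return theta_draws, sigma2_draws


# Toy one-way layout (hypothetical data): 5 groups with 4 observations each.
rng = np.random.default_rng(0)
groups = np.repeat(np.arange(5), 4)
Z = (groups[:, None] == np.arange(5)).astype(float)
X = np.ones((20, 1))
y = 1.0 + Z @ rng.normal(0.0, 1.0, 5) + rng.normal(0.0, 0.5, 20)
theta_draws, sigma2_draws = gibbs_glmm(
    y, X, [Z], a_e=0.0, b_e=0.0, a=[-0.5], b=[0.0], n_iter=200, rng=rng
)
```

Working with the precision matrix avoids forming $Q^{-1}$ explicitly; any other algebraically equivalent way of drawing from the same normal distribution would serve equally well.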

Because the Gibbs Markov chain is Harris ergodic (see Section 2), we can use it to construct
consistent estimates of intractable posterior expectations. For $k > 0$, let $L_k(\pi)$ denote the set of
functions $g : \mathbb{R}^{p+q} \times \mathbb{R}_+^{r+1} \to \mathbb{R}$ such that

$$E_\pi |g|^k := \int_{\mathbb{R}_+^{r+1}} \int_{\mathbb{R}^{p+q}} |g(\theta, \sigma^2)|^k \, \pi(\theta, \sigma^2 | y) \, d\theta \, d\sigma^2 < \infty .$$

If $g \in L_1(\pi)$, then the average $\bar{g}_m := \frac{1}{m} \sum_{i=0}^{m-1} g(\theta_i, \sigma_i^2)$ is a strongly consistent estimator
of $E_\pi g$, no matter how the chain is started. Of course, in practice, an estimator is only useful
if it is possible to compute an associated standard error. All available methods of computing
a valid asymptotic standard error for $\bar{g}_m$ are based on the existence of a central limit theorem
(CLT) for $\bar{g}_m$ (Flegal et al., 2008; Jones et al., 2006). Unfortunately, even if $g \in L_k(\pi)$ for all
$k > 0$, Harris ergodicity is not enough to guarantee the existence of such a CLT for $\bar{g}_m$ (see, e.g.,
Roberts and Rosenthal, 1998, 2004). The standard method of establishing the existence of CLTs is
to prove that the underlying Markov chain converges at a geometric rate.
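
To make the role of the CLT concrete, the following sketch computes a batch means standard error for an ergodic average. It is only an illustration: batch means is one commonly used estimator, the function name `batch_means_se` and the square-root batch length are hypothetical choices, and the resulting number is meaningful only when a CLT actually holds for the chain.

```python
import math
import random


def batch_means_se(draws):
    """Batch means estimate of the Monte Carlo standard error of the sample mean."""
    n = len(draws)
    b = int(math.sqrt(n))  # batch length (a simple default, assumed here)
    k = n // b             # number of complete batches
    batch_means = [sum(draws[j * b:(j + 1) * b]) / b for j in range(k)]
    grand_mean = sum(batch_means) / k
    # b times the sample variance of the batch means estimates the CLT variance
    var_hat = b * sum((m - grand_mean) ** 2 for m in batch_means) / (k - 1)
    return math.sqrt(var_hat / (k * b))


# Sanity check on iid N(0, 1) draws, where the true standard error is 1 / sqrt(n).
random.seed(1)
iid_draws = [random.gauss(0.0, 1.0) for _ in range(10_000)]
se = batch_means_se(iid_draws)
```

For output from the Gibbs sampler, `draws` would hold the values $g(\theta_i, \sigma_i^2)$ along the chain.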

Let $\mathcal{B}(\mathsf{X})$ denote the Borel sets in $\mathsf{X}$, and let $P^n : \mathsf{X} \times \mathcal{B}(\mathsf{X}) \to [0, 1]$ denote the $n$-step Markov
transition function of the Gibbs Markov chain. That is, $P^n\big((\theta, \sigma^2), A\big)$ is the probability that
$(\theta_n, \sigma_n^2) \in A$, given that the chain is started at $(\theta_0, \sigma_0^2) = (\theta, \sigma^2)$. Also, let $\Pi(\cdot)$ denote the posterior
distribution. The chain is called geometrically ergodic if there exist a function $M : \mathsf{X} \to [0, \infty)$ and
a constant $\varrho \in [0, 1)$ such that, for all $(\theta, \sigma^2) \in \mathsf{X}$ and all $n = 0, 1, \dots$, we have

$$\big\| P^n\big((\theta, \sigma^2), \cdot\big) - \Pi(\cdot) \big\|_{\mathrm{TV}} \le M(\theta, \sigma^2) \, \varrho^n ,$$

where $\| \cdot \|_{\mathrm{TV}}$ denotes the total variation norm. The relationship between geometric convergence
and CLTs is simple: If the chain is geometrically ergodic and $E_\pi |g|^{2+\delta} < \infty$ for some $\delta > 0$,
then $\bar{g}_m$ satisfies a CLT. Our main result (Theorem 2 in Section 3) provides conditions under which
the Gibbs Markov chain is geometrically ergodic. Checking these conditions may require some
numerical work. The following corollary to Theorem 2 is weaker, but easier to apply.

Corollary 1. Assume that $\mathrm{rank}(X) = p$. If the following four conditions hold, then the Gibbs
Markov chain is geometrically ergodic.

(A) For each $i \in \{1, 2, \dots, r\}$, one of the following holds:

(A1) $a_i < b_i = 0$; (A2) $b_i > 0$

(B') For each $i \in \{1, 2, \dots, r\}$, $q_i + 2a_i > q - t + 2$

(C') $N + 2a_e > p + t + 2$

(D) $2b_e + \|(I - W(W^T W)^- W^T) y\|^2 > 0$
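
Because conditions (B') and (C') differ from (B) and (C) only by the added constants, the corollary can be checked with the same inputs as Theorem 1. The sketch below is illustrative only: the name `check_corollary_conditions` is hypothetical, and $t$ and the residual sum of squares `sse` are assumed to be supplied by the user.

```python
def check_corollary_conditions(N, p, q_sizes, t, a_e, b_e, a, b, sse):
    """Return True when (A), (B'), (C') and (D) of Corollary 1 all hold."""
    q = sum(q_sizes)
    cond_A = all((a_i < 0 and b_i == 0) or b_i > 0 for a_i, b_i in zip(a, b))
    # (B'): the propriety condition (B) with 2 added to the right-hand side
    cond_B_prime = all(q_i + 2 * a_i > q - t + 2 for q_i, a_i in zip(q_sizes, a))
    # (C'): N + 2 a_e > p + t + 2
    cond_C_prime = N + 2 * a_e > p + t + 2
    cond_D = 2 * b_e + sse > 0
    return cond_A and cond_B_prime and cond_C_prime and cond_D


# Balanced one-way layout with an intercept, so p = 1 and t = c - 1,
# under the prior with a_e = 0, a_1 = -1/2 and b = 0.
five_groups = check_corollary_conditions(15, 1, [5], 4, 0.0, 0.0, [-0.5], [0.0], sse=3.2)
four_groups = check_corollary_conditions(12, 1, [4], 3, 0.0, 0.0, [-0.5], [0.0], sse=3.2)
```

In this example the corollary applies with five groups but not with four, because $q_1 + 2a_1 = 3$ is not strictly larger than $q - t + 2 = 3$, even though the propriety condition (B) still holds with four groups.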

In cases where the posterior density is improper, it is sometimes still possible to run the Gibbs
sampler, but the corresponding Markov chains cannot be geometrically ergodic (see Section 2).
Therefore, the best we could hope for is that the Gibbs Markov chain is geometrically ergodic
whenever the posterior is proper. With this in mind, note that the conditions of Corollary 1 are very
close to the conditions for propriety given in Theorem 1. In fact, the former imply the latter. To see
this, note that (B') clearly implies (B), which, in turn, implies that

$$\sum_{i=1}^r (q_i + 2a_i) I_{(-\infty,0)}(a_i) > q - t .$$

Now since $q = q_1 + \cdots + q_r$, we have

$$t > q - \sum_{i=1}^r (q_i + 2a_i) I_{(-\infty,0)}(a_i) \ge -2 \sum_{i=1}^r a_i I_{(-\infty,0)}(a_i) ,$$

and it follows that

$$p + t + 2 > p + t > p - 2 \sum_{i=1}^r a_i I_{(-\infty,0)}(a_i) .$$

Hence, (C') implies (C). The strong similarity between Theorem 1 and Corollary 1 might lead
the reader to believe that the proofs of our results rely somehow on Theorem 1. This is not the
case. Indeed, the conditions of Corollary 1 arise completely independently. In fact, we do not even
assume propriety before embarking on our convergence rate analysis (see Section 3).



Analogues of Corollary 1 for the GLMM with proper priors can be found in Johnson and Jones
(2010) and Román (2012). The proper and improper cases are similar in the sense that geometric
ergodicity is established via geometric drift conditions in both cases. However, the drift conditions
are quite disparate, and the analysis required in the improper case is substantially more demanding.
In fact, the only other existing results on geometric convergence of Gibbs samplers for linear
mixed models with improper priors are those of Tan and Hobert (2009), who considered the centered
version of the one-way random effects model. Since the centered model is not a special case
of our GLMM, their results are not special cases of ours. On the surface, the difference between the
centered and non-centered models may seem inconsequential, but it is well known that MCMC
algorithms corresponding to models that differ only by minor reparameterizations can have radically
different convergence rates (see, e.g., Gelfand et al., 1995; Papaspiliopoulos et al., 2007;
Yu and Meng, 2011). Finally, we note that the linear models considered by Papaspiliopoulos and
Roberts (2008) are substantively different from ours because these authors assume that the variance
components are known.

The remainder of this paper is organized as follows. Section 2 contains a formal definition of
the Gibbs Markov chain. The main convergence result is stated and proven in Section 3. Finally,
Section 4 concerns an interesting technical issue related to the use of improper priors. An oversight
by Tan and Hobert (2009) regarding this technicality led to an error in the proof of their main result.
However, it is shown in Section 4 that their proof is easily repaired (using the results developed
herein), and that their result remains correct as stated in their paper.



2 The Gibbs Sampler 



In this section, we formally define the Gibbs sampler, and state some of its properties. Recall that
$\theta = (\beta^T \; u^T)^T$ and $\sigma^2 = (\sigma_e^2 \; \sigma_{u_1}^2 \cdots \sigma_{u_r}^2)^T$. Suppose that

$$\int_{\mathbb{R}^{p+q}} \pi^*(\theta, \sigma^2 | y) \, d\theta < \infty \qquad (2)$$

for all $\sigma^2$ outside a set of measure zero in $\mathbb{R}_+^{r+1}$, and that

$$\int_{\mathbb{R}_+^{r+1}} \pi^*(\theta, \sigma^2 | y) \, d\sigma^2 < \infty \qquad (3)$$

for all $\theta$ outside a set of measure zero in $\mathbb{R}^{p+q}$. These two integrability conditions are necessary,
but not sufficient, for posterior propriety. When they hold, we can define conditional densities as
follows:

$$\pi(\theta | \sigma^2, y) = \frac{\pi^*(\theta, \sigma^2 | y)}{\int_{\mathbb{R}^{p+q}} \pi^*(\theta, \sigma^2 | y) \, d\theta} \quad \text{and} \quad \pi(\sigma^2 | \theta, y) = \frac{\pi^*(\theta, \sigma^2 | y)}{\int_{\mathbb{R}_+^{r+1}} \pi^*(\theta, \sigma^2 | y) \, d\sigma^2} .$$

Clearly, when the posterior is proper, these conditionals are the usual ones based on $\pi(\theta, \sigma^2 | y)$.
On the other hand, when the posterior is improper, they are incompatible conditional densities; i.e.,
there is no (proper) joint density that generates them. This incompatibility does not prevent us from
using these conditionals to run the Gibbs sampler as described in the Introduction, but the resulting
Markov chain will be unstable. This will be formalized below.

We now describe a minimal set of conditions under which the integrability conditions are satisfied.
Let $\hat\theta = (W^T W)^- W^T y$, and assume that

$$2b_e + \|y - W\hat\theta\|^2 > 0 . \qquad (4)$$

Note that $\|Y - W\hat\theta\|^2$ is strictly positive with probability one under the data generating model.
Also, (4) implies that, for all $\theta \in \mathbb{R}^{p+q}$,

$$2b_e + \|y - W\theta\|^2 = 2b_e + \|y - W\hat\theta\|^2 + \|W\hat\theta - W\theta\|^2 \ge 2b_e + \|y - W\hat\theta\|^2 > 0 .$$

Define $\tilde{s} = \min\{q_1 + 2a_1, q_2 + 2a_2, \dots, q_r + 2a_r, N + 2a_e\}$, and assume that

$$\tilde{s} > 0 . \qquad (5)$$

Finally, assume that

$$b_i \ge 0 \quad \forall \, i \in \{1, 2, \dots, r\} . \qquad (6)$$

Under (4), (5) and (6), the integrability conditions, (2) and (3), hold, so the conditional densities are
well defined. Routine manipulation of $\pi^*(\theta, \sigma^2 | y)$ shows that $\pi(\theta | \sigma^2, y)$ is a multivariate normal
density with mean vector

$$\begin{bmatrix} (X^T X)^{-1} X^T \big[ I - (\sigma_e^2)^{-1} Z Q^{-1} Z^T P^\perp \big] y \\ (\sigma_e^2)^{-1} Q^{-1} Z^T P^\perp y \end{bmatrix}$$

and covariance matrix

$$\begin{bmatrix} \big((\sigma_e^2)^{-1} X^T X\big)^{-1} + (X^T X)^{-1} X^T Z Q^{-1} Z^T X (X^T X)^{-1} & -(X^T X)^{-1} X^T Z Q^{-1} \\ -Q^{-1} Z^T X (X^T X)^{-1} & Q^{-1} \end{bmatrix} ,$$

where $P^\perp = I - P = I - X(X^T X)^{-1} X^T$ and $Q = (\sigma_e^2)^{-1} Z^T P^\perp Z + D^{-1}$.

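Things are a bit more complicated for $\pi(\sigma^2 | \theta, y)$ due to the possible existence of a bothersome
set of measure zero. Define $A = \{i \in \{1, 2, \dots, r\} : b_i = 0\}$. If $A$ is empty, then $\pi(\sigma^2 | \theta, y)$ is
well defined for every $\theta \in \mathbb{R}^{p+q}$, and it is the following product of $r + 1$ inverted gamma densities

$$\pi(\sigma^2 | \theta, y) = f_{IG}\Big(\sigma_e^2; \frac{N}{2} + a_e, \, b_e + \frac{\|y - W\theta\|^2}{2}\Big) \prod_{i=1}^r f_{IG}\Big(\sigma_{u_i}^2; \frac{q_i}{2} + a_i, \, b_i + \frac{\|u_i\|^2}{2}\Big) ,$$

where

$$f_{IG}(v; c, d) = \frac{d^c}{\Gamma(c)} \, v^{-(c+1)} e^{-d/v} \, I_{(0,\infty)}(v)$$

for $c, d > 0$. On the other hand, if $A$ is nonempty, then

$$\int_{\mathbb{R}_+^{r+1}} \pi^*(\theta, \sigma^2 | y) \, d\sigma^2 = \infty$$

whenever $\theta \in \mathcal{N} := \{\theta \in \mathbb{R}^{p+q} : \prod_{i \in A} \|u_i\| = 0\}$. The fact that $\pi(\sigma^2 | \theta, y)$ is not defined when
$\theta \in \mathcal{N}$ is irrelevant from a simulation standpoint because the probability of observing a $\theta$ in $\mathcal{N}$ is
zero. However, in order to perform a theoretical analysis, the Markov transition density (Mtd) of
the Gibbs Markov chain must be defined for every $\theta \in \mathbb{R}^{p+q}$. Obviously, the Mtd can be defined
arbitrarily on a set of measure zero. We define it as follows:

$$\pi(\sigma^2 | \theta, y) = \begin{cases} f_{IG}\Big(\sigma_e^2; \frac{N}{2} + a_e, \, b_e + \frac{\|y - W\theta\|^2}{2}\Big) \prod_{i=1}^r f_{IG}\Big(\sigma_{u_i}^2; \frac{q_i}{2} + a_i, \, b_i + \frac{\|u_i\|^2}{2}\Big) & \text{if } \theta \notin \mathcal{N} , \\ f_{IG}(\sigma_e^2; 1, 1) \prod_{i=1}^r f_{IG}(\sigma_{u_i}^2; 1, 1) & \text{if } \theta \in \mathcal{N} . \end{cases}$$

This definition can be used in all cases if we simply define $\mathcal{N}$ to be $\emptyset$ whenever $A$ is empty.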
The Mtd of the Gibbs Markov chain, $\{(\theta_n, \sigma_n^2)\}_{n=0}^\infty$, is defined as

$$k(\tilde\theta, \tilde\sigma^2 | \theta, \sigma^2) = \pi(\tilde\sigma^2 | \tilde\theta, y) \, \pi(\tilde\theta | \sigma^2, y) .$$

It is easy to see that the chain is $\psi$-irreducible, and that $\pi^*(\theta, \sigma^2 | y)$ is an unnormalized invariant
density. It follows that the chain is positive recurrent if and only if the posterior is proper
(Meyn and Tweedie, 1993, Chapter 10). Since a geometrically ergodic chain is necessarily positive
recurrent, the Gibbs Markov chain cannot be geometrically ergodic when the posterior is improper.
The marginal sequences, $\{\theta_n\}_{n=0}^\infty$ and $\{\sigma_n^2\}_{n=0}^\infty$, are themselves Markov chains (see, e.g.,
Liu et al., 1994). The $\sigma^2$-chain lives on $\mathbb{R}_+^{r+1}$ and has Mtd given by

$$k_1(\tilde\sigma^2 | \sigma^2) = \int_{\mathbb{R}^{p+q}} \pi(\tilde\sigma^2 | \theta, y) \, \pi(\theta | \sigma^2, y) \, d\theta ,$$

and invariant density $\int_{\mathbb{R}^{p+q}} \pi^*(\theta, \sigma^2 | y) \, d\theta$. Similarly, the $\theta$-chain lives on $\mathbb{R}^{p+q}$ and has Mtd

$$k_2(\tilde\theta | \theta) = \int_{\mathbb{R}_+^{r+1}} \pi(\tilde\theta | \sigma^2, y) \, \pi(\sigma^2 | \theta, y) \, d\sigma^2 ,$$

and invariant density $\int_{\mathbb{R}_+^{r+1}} \pi^*(\theta, \sigma^2 | y) \, d\sigma^2$. Since the two marginal chains are also $\psi$-irreducible,
they are positive recurrent if and only if the posterior is proper. Moreover, when the posterior
is proper, routine calculations show that all three chains are Harris ergodic; i.e., positive Harris
recurrent, $\psi$-irreducible and aperiodic (see Román (2012) for details). An important fact that we
will exploit is that geometric ergodicity is a solidarity property for the three chains $\{(\theta_n, \sigma_n^2)\}_{n=0}^\infty$,
$\{\theta_n\}_{n=0}^\infty$ and $\{\sigma_n^2\}_{n=0}^\infty$; that is, either all three are geometric or none of them is (Diaconis et al.,
2008; Roberts and Rosenthal, 2001). In the next section, we prove that the Gibbs Markov chain
converges at a geometric rate by proving that one of the marginal chains does.



3 The Main Result 

In order to state the main result, we need a bit more notation. Write the spectral decomposition of
the non-negative definite matrix $Z^T P^\perp Z$ as $\Gamma^T \Lambda \Gamma$, so $\Gamma$ is a $q$-dimensional orthogonal matrix, and
$\Lambda$ is a diagonal matrix containing the eigenvalues of $Z^T P^\perp Z$, which we denote by $\{\lambda_i\}_{i=1}^q$. Define $H$
to be a $q \times q$ diagonal matrix whose diagonal elements, $\{h_i\}_{i=1}^q$, are given by

$$h_i = \begin{cases} 1 & \lambda_i = 0 , \\ 0 & \lambda_i \ne 0 . \end{cases}$$

Finally, for $i \in \{1, \dots, r\}$, define $R_i$ to be the $q_i \times q$ matrix of 0s and 1s such that $R_i u = u_i$. In
other words, $R_i$ is the matrix that extracts $u_i$ from $u$. Here is our main result.

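Theorem 2. Assume that $\mathrm{rank}(X) = p$. Assume further that $2b_e + \|y - W\hat\theta\|^2 > 0$, $\tilde{s} > 0$ and
$b_i \ge 0$ for each $i \in \{1, 2, \dots, r\}$, so that the Gibbs sampler is well defined. If the following two
conditions hold, then the Gibbs Markov chain is geometrically ergodic.

1. For each $i \in \{1, 2, \dots, r\}$, one of the following holds:

(i) $a_i < b_i = 0$; (ii) $b_i > 0$.

2. There exists an $s \in (0, 1] \cap (0, \tilde{s}/2)$ such that

$$2^{-s} (p + t)^s \, \frac{\Gamma\big(\frac{N}{2} + a_e - s\big)}{\Gamma\big(\frac{N}{2} + a_e\big)} < 1$$

and

$$2^{-s} \sum_{i=1}^r \frac{\Gamma\big(\frac{q_i}{2} + a_i - s\big)}{\Gamma\big(\frac{q_i}{2} + a_i\big)} \Big( \mathrm{tr}\big(R_i \Gamma^T H \Gamma R_i^T\big) \Big)^s < 1 ,$$

where $t = \mathrm{rank}\big(Z^T P^\perp Z\big)$.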

Remark 1. It is important to reiterate that, by themselves, (4), (5) and (6) do not imply that the
posterior density is proper. Of course, if conditions 1. and 2. in Theorem 2 hold as well, then the
chain is geometric, so the posterior is necessarily proper.

Remark 2. A numerical search could be employed to check the second condition of Theorem 2.
Indeed, one could evaluate the two functions of $s$ at all points on a fine grid of values in the interval
$(0, 1] \cap (0, \tilde{s}/2)$. The goal, of course, would be to find a single value of $s$ at which both functions
take values less than 1. Also, recall from the Introduction that Corollary 1 provides an alternative
set of sufficient conditions that are harder to satisfy, but easier to check. A proof of Corollary 1 is
given at the end of this section.
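
The numerical search described in Remark 2 can be sketched as follows. This is a hedged illustration, not the authors' code. It assumes that the two functions of $s$ are $\delta_1(s) = 2^{-s}(p+t)^s\,\Gamma(N/2 + a_e - s)/\Gamma(N/2 + a_e)$ and $\delta_3(s) = 2^{-s}\sum_{i} \zeta_i^s\,\Gamma(q_i/2 + a_i - s)/\Gamma(q_i/2 + a_i)$ with $\zeta_i = \mathrm{tr}(R_i \Gamma^T H \Gamma R_i^T)$, which is the form used in the proof of Corollary 1. The names `delta_1`, `delta_3` and `search_s` are hypothetical, and $t$ and the $\zeta_i$ are assumed to be computed elsewhere.

```python
import math


def delta_1(s, N, p, t, a_e):
    """First function of s, evaluated on the log scale for numerical stability."""
    log_val = (-s * math.log(2.0) + s * math.log(p + t)
               + math.lgamma(N / 2 + a_e - s) - math.lgamma(N / 2 + a_e))
    return math.exp(log_val)


def delta_3(s, q_sizes, a, zeta):
    """Second function of s; zeta[i] is assumed to equal tr(R_i Gamma' H Gamma R_i')."""
    total = 0.0
    for q_i, a_i, z_i in zip(q_sizes, a, zeta):
        ratio = math.exp(math.lgamma(q_i / 2 + a_i - s) - math.lgamma(q_i / 2 + a_i))
        total += 2.0 ** (-s) * ratio * z_i ** s
    return total


def search_s(N, p, t, a_e, q_sizes, a, zeta, grid_size=1000):
    """Return the first grid point in (0, 1] ∩ (0, s_tilde / 2) where both functions are < 1."""
    s_tilde = min([N + 2 * a_e] + [q_i + 2 * a_i for q_i, a_i in zip(q_sizes, a)])
    if s_tilde <= 0:
        return None
    upper = min(1.0, s_tilde / 2)
    for j in range(1, grid_size + 1):
        s = upper * j / grid_size
        if s >= s_tilde / 2:  # the interval is open at s_tilde / 2
            continue
        if delta_1(s, N, p, t, a_e) < 1 and delta_3(s, q_sizes, a, zeta) < 1:
            return s
    return None


# Balanced one-way layout: 6 groups of 3 observations, intercept only, so
# N = 18, p = 1, t = 5, and the single zeta equals q - t = 1.
found = search_s(N=18, p=1, t=5, a_e=0.0, q_sizes=[6], a=[-0.5], zeta=[1.0])
```

A return value of `None` only means that no grid point worked; a finer grid might still succeed, so the search can confirm the condition but cannot refute it.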

We will prove Theorem 2 indirectly by proving that the $\sigma^2$-chain is geometrically ergodic (when
the conditions of Theorem 2 hold). This is accomplished by establishing a geometric drift condition
for the $\sigma^2$-chain.

Proposition 1. Assume that $\mathrm{rank}(X) = p$. Assume further that $2b_e + \|y - W\hat\theta\|^2 > 0$, $\tilde{s} > 0$ and
$b_i \ge 0$ for each $i \in \{1, 2, \dots, r\}$. Under the two conditions of Theorem 2, there exist a $\rho \in [0, 1)$
and a finite constant $L$ such that, for every $\sigma^2 \in \mathbb{R}_+^{r+1}$,

$$E\big(v(\tilde\sigma^2) \,|\, \sigma^2\big) \le \rho \, v(\sigma^2) + L , \qquad (7)$$

where the drift function is defined as

$$v(\sigma^2) = \alpha (\sigma_e^2)^s + \sum_{i=1}^r (\sigma_{u_i}^2)^s + \alpha (\sigma_e^2)^{-c} + \sum_{i=1}^r (\sigma_{u_i}^2)^{-c} ,$$

and $\alpha > 0$ and $c > 0$ are fixed constants to be determined. Hence, under the two conditions of
Theorem 2, the $\sigma^2$-chain is geometrically ergodic.

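Proof. By conditioning on $\tilde\theta$ and iterating, we can express $E\big(v(\tilde\sigma^2) | \sigma^2\big)$ as

$$E\Big[ \alpha E\big((\tilde\sigma_e^2)^s | \tilde\theta\big) + \sum_{i=1}^r E\big((\tilde\sigma_{u_i}^2)^s | \tilde\theta\big) + \alpha E\big((\tilde\sigma_e^2)^{-c} | \tilde\theta\big) + \sum_{i=1}^r E\big((\tilde\sigma_{u_i}^2)^{-c} | \tilde\theta\big) \,\Big|\, \sigma^2 \Big] . \qquad (8)$$

We now develop upper bounds for each of the four terms inside the square brackets in (8). Fix
$s \in S := (0, 1] \cap (0, \tilde{s}/2)$, and define

$$G_0(s) = 2^{-s} \, \frac{\Gamma\big(\frac{N}{2} + a_e - s\big)}{\Gamma\big(\frac{N}{2} + a_e\big)} ,$$

and, for each $i \in \{1, 2, \dots, r\}$, define

$$G_i(s) = 2^{-s} \, \frac{\Gamma\big(\frac{q_i}{2} + a_i - s\big)}{\Gamma\big(\frac{q_i}{2} + a_i\big)} .$$

Note that, since $s \in (0, 1]$, $(x_1 + x_2)^s \le x_1^s + x_2^s$ whenever $x_1, x_2 \ge 0$. Thus,

$$E\big((\sigma_e^2)^s | \theta\big) = 2^s G_0(s) \Big( b_e + \frac{\|y - W\theta\|^2}{2} \Big)^s \le 2^s G_0(s) \Big[ b_e^s + \Big( \frac{\|y - W\theta\|^2}{2} \Big)^s \Big] = G_0(s) \big( \|y - W\theta\|^2 \big)^s + \text{const} ,$$

where "const" denotes a generic constant. Similarly,

$$E\big((\sigma_{u_i}^2)^s | \theta\big) = 2^s G_i(s) \Big( b_i + \frac{\|u_i\|^2}{2} \Big)^s \le G_i(s) \big( \|u_i\|^2 \big)^s + \text{const} .$$

Now, for any $c > 0$, we have

$$E\big((\sigma_e^2)^{-c} | \theta\big) = 2^{-c} G_0(-c) \Big( b_e + \frac{\|y - W\theta\|^2}{2} \Big)^{-c} \le 2^{-c} G_0(-c) \Big( b_e + \frac{\|y - W\hat\theta\|^2}{2} \Big)^{-c} = \text{const} ,$$

and, for each $i \in \{1, 2, \dots, r\}$,

$$E\big((\sigma_{u_i}^2)^{-c} | \theta\big) = 2^{-c} G_i(-c) \Big( b_i + \frac{\|u_i\|^2}{2} \Big)^{-c} \le G_i(-c) \Big[ \big( \|u_i\|^2 \big)^{-c} I_{\{0\}}(b_i) + (2b_i)^{-c} I_{(0,\infty)}(b_i) \Big] .$$

Let $A = \{i : a_i < b_i = 0\}$, and note that $\sum_{i=1}^r E\big((\sigma_{u_i}^2)^{-c} | \theta\big)$ can be bounded above by a constant
if $A$ is empty. Thus, we consider the case in which $A$ is empty separately from the case where
$A \ne \emptyset$. We begin with the latter, which is the more difficult case.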

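Case I: $A$ is non-empty. Combining the four bounds above (and applying Jensen's inequality
twice), we have

$$E\big(v(\tilde\sigma^2) | \sigma^2\big) \le \alpha G_0(s) \big[ E\big(\|y - W\theta\|^2 | \sigma^2\big) \big]^s + \sum_{i=1}^r G_i(s) \big[ E\big(\|u_i\|^2 | \sigma^2\big) \big]^s + \sum_{i \in A} G_i(-c) \, E\big[ \big(\|u_i\|^2\big)^{-c} | \sigma^2 \big] + \text{const} . \qquad (9)$$

Appendix A.2 contains a proof of the following inequality:

$$E\big[ \|y - W\theta\|^2 | \sigma^2 \big] \le (p + t) \sigma_e^2 + \text{const} . \qquad (10)$$

It follows immediately that

$$\big[ E\big(\|y - W\theta\|^2 | \sigma^2\big) \big]^s \le (p + t)^s (\sigma_e^2)^s + \text{const} .$$

In Appendix A.3 it is shown that, for each $i \in \{1, 2, \dots, r\}$, we have

$$E\big[ \|u_i\|^2 | \sigma^2 \big] \le \xi_i \sigma_e^2 + \zeta_i \sum_{j=1}^r \sigma_{u_j}^2 + \text{const} ,$$

where $\xi_i = \mathrm{tr}\big(R_i (Z^T P^\perp Z)^+ R_i^T\big)$, $\zeta_i = \mathrm{tr}\big(R_i \Gamma^T H \Gamma R_i^T\big)$, and $A^+$ denotes the Moore-Penrose
inverse of the matrix $A$. It follows that

$$\big[ E\big(\|u_i\|^2 | \sigma^2\big) \big]^s \le \xi_i^s (\sigma_e^2)^s + \zeta_i^s \sum_{j=1}^r (\sigma_{u_j}^2)^s + \text{const} . \qquad (11)$$

In Appendix A.4 it is established that, for any $c \in (0, 1/2)$, and for each $i \in \{1, 2, \dots, r\}$, we have

$$E\big[ \big(\|u_i\|^2\big)^{-c} | \sigma^2 \big] \le 2^{-c} \, \frac{\Gamma\big(\frac{q_i}{2} - c\big)}{\Gamma\big(\frac{q_i}{2}\big)} \Big[ \lambda_{\max}^c (\sigma_e^2)^{-c} + (\sigma_{u_i}^2)^{-c} \Big] , \qquad (12)$$

where $\lambda_{\max}$ denotes the largest eigenvalue of $Z^T P^\perp Z$. Combining (10)-(12) with (9), we have

$$E\big(v(\tilde\sigma^2) | \sigma^2\big) \le \alpha \Big( \delta_1(s) + \frac{\delta_2(s)}{\alpha} \Big) (\sigma_e^2)^s + \delta_3(s) \sum_{j=1}^r (\sigma_{u_j}^2)^s + \alpha \, \frac{\delta_4(c)}{\alpha} \, (\sigma_e^2)^{-c} + \delta_5(c) \sum_{j \in A} (\sigma_{u_j}^2)^{-c} + \text{const} , \qquad (13)$$

where

$$\delta_1(s) := G_0(s) (p + t)^s , \quad \delta_2(s) := \sum_{i=1}^r \xi_i^s G_i(s) , \quad \delta_3(s) := \sum_{i=1}^r \zeta_i^s G_i(s) ,$$

$$\delta_4(c) := 2^{-c} \lambda_{\max}^c \sum_{i \in A} G_i(-c) \, \frac{\Gamma\big(\frac{q_i}{2} - c\big)}{\Gamma\big(\frac{q_i}{2}\big)} \quad \text{and} \quad \delta_5(c) := 2^{-c} \max_{i \in A} \Big[ G_i(-c) \, \frac{\Gamma\big(\frac{q_i}{2} - c\big)}{\Gamma\big(\frac{q_i}{2}\big)} \Big] .$$

Hence,

$$E\big(v(\tilde\sigma^2) | \sigma^2\big) \le \rho(\alpha, s, c) \, v(\sigma^2) + L ,$$

where

$$\rho(\alpha, s, c) = \max\Big\{ \delta_1(s) + \frac{\delta_2(s)}{\alpha} , \; \delta_3(s) , \; \frac{\delta_4(c)}{\alpha} , \; \delta_5(c) \Big\} .$$

All that remains is to show that there exists a triple $(\alpha, s, c) \in \mathbb{R}_+ \times S \times (0, 1/2)$ such that
$\rho(\alpha, s, c) < 1$. We begin by demonstrating that, if $c$ is small enough, then $\delta_5(c) < 1$. Define
$\tilde{a} = -\max_{i \in A} a_i$. Also, set $C = (0, 1/2) \cap (0, \tilde{a})$. Fix $c \in C$ and note that

$$\delta_5(c) = \max_{i \in A} \Big\{ \frac{\Gamma\big(\frac{q_i}{2} + a_i + c\big) \, \Gamma\big(\frac{q_i}{2} - c\big)}{\Gamma\big(\frac{q_i}{2} + a_i\big) \, \Gamma\big(\frac{q_i}{2}\big)} \Big\} .$$

For any $i \in A$, $c + a_i < 0$, and since $\tilde{s} > 0$, it follows that

$$0 < \frac{q_i}{2} + a_i < \frac{q_i}{2} + a_i + c < \frac{q_i}{2} .$$

But, $\Gamma(x - z)/\Gamma(x)$ is decreasing in $x$ for $x > z > 0$, so we have

$$\frac{\Gamma\big(\frac{q_i}{2} + a_i\big)}{\Gamma\big(\frac{q_i}{2} + a_i + c\big)} = \frac{\Gamma\big(\frac{q_i}{2} + a_i + c - c\big)}{\Gamma\big(\frac{q_i}{2} + a_i + c\big)} > \frac{\Gamma\big(\frac{q_i}{2} - c\big)}{\Gamma\big(\frac{q_i}{2}\big)} ,$$

and it follows immediately that $\delta_5(c) < 1$ whenever $c \in C$. The two conditions of Theorem 2 imply
that there exists an $s^* \in S$ such that $\delta_1(s^*) < 1$ and $\delta_3(s^*) < 1$. Let $c^*$ be any point in $C$, and
choose $\alpha^*$ to be any number larger than

$$\max\Big\{ \frac{\delta_2(s^*)}{1 - \delta_1(s^*)} , \; \delta_4(c^*) \Big\} .$$

A simple calculation shows that $\rho(\alpha^*, s^*, c^*) < 1$, and this completes the proof for Case I.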

Case II: $A = \emptyset$. Since we no longer have to deal with $E\big((\sigma_{u_i}^2)^{-c} | \theta\big)$, the bound (13) becomes

$$E\big(v(\tilde\sigma^2) | \sigma^2\big) \le \alpha \Big( \delta_1(s) + \frac{\delta_2(s)}{\alpha} \Big) (\sigma_e^2)^s + \delta_3(s) \sum_{j=1}^r (\sigma_{u_j}^2)^s + \text{const} ,$$

and there is no restriction on $c$ other than $c > 0$. Hence,

$$E\big(v(\tilde\sigma^2) | \sigma^2\big) \le \rho(\alpha, s) \, v(\sigma^2) + L ,$$

where

$$\rho(\alpha, s) = \max\Big\{ \delta_1(s) + \frac{\delta_2(s)}{\alpha} , \; \delta_3(s) \Big\} .$$

All that remains is to show that there exists a pair $(\alpha, s) \in \mathbb{R}_+ \times S$ such that $\rho(\alpha, s) < 1$. As in Case
I, the two conditions of Theorem 2 imply that there exists an $s^* \in S$ such that $\delta_1(s^*) < 1$ and
$\delta_3(s^*) < 1$. Let $\alpha^*$ be any number larger than

$$\frac{\delta_2(s^*)}{1 - \delta_1(s^*)} .$$

A simple calculation shows that $\rho(\alpha^*, s^*) < 1$, and this completes the proof for Case II.

The only thing left to do is to show that (7) implies that the $\sigma^2$-chain is geometrically ergodic.
Note that the $\sigma^2$-chain is $\psi$-irreducible and aperiodic. Moreover, because its Mtd is strictly positive
on $\mathbb{R}_+^{r+1} \times \mathbb{R}_+^{r+1}$, its maximal irreducibility measure is equivalent to Lebesgue measure on
$\mathbb{R}_+^{r+1}$. Thus, it follows from Meyn and Tweedie's (1993) Lemma 15.2.8 and their Theorem 6.0.1 that, if
the $\sigma^2$-chain is a Feller chain and the drift function is unbounded off compact sets, then (7) implies
that the $\sigma^2$-chain is geometrically ergodic.

We first show that the drift function is unbounded off compact sets; i.e., we will demonstrate
that, for every $d \in \mathbb{R}$, the set

$$S_d = \big\{ \sigma^2 \in \mathbb{R}_+^{r+1} : v(\sigma^2) \le d \big\}$$

is compact. Let $d$ be such that $S_d$ is non-empty (otherwise $S_d$ is trivially compact), which means
that $d$ and $d/\alpha$ must be larger than 1. Since $v(\sigma^2)$ is a continuous function, $S_d$ is closed in $\mathbb{R}_+^{r+1}$.
Now consider the following set:

$$T_d = \big[ (d/\alpha)^{-1/c}, (d/\alpha)^{1/s} \big] \times \big[ d^{-1/c}, d^{1/s} \big] \times \cdots \times \big[ d^{-1/c}, d^{1/s} \big] .$$

The set $T_d$ is compact in $\mathbb{R}^{r+1}$, and hence in $\mathbb{R}_+^{r+1}$. Since $S_d \subset T_d$, $S_d$ is a closed subset of a
compact set in $\mathbb{R}_+^{r+1}$, so it is compact in $\mathbb{R}_+^{r+1}$. Hence, the drift function is unbounded off compact
sets.

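To complete the argument, we must show that the $\sigma^2$-chain is a Feller chain. Let $P_1$ denote the
Markov transition function of the $\sigma^2$-chain; that is, for any $\sigma^2 \in \mathbb{R}_+^{r+1}$ and any Borel set $A$,

$$P_1(\sigma^2, A) = \int_A k_1(\tilde\sigma^2 | \sigma^2) \, d\tilde\sigma^2 .$$

The chain is Feller if, for each fixed open set $O$, $P_1(\cdot, O)$ is a lower semi-continuous function on
$\mathbb{R}_+^{r+1}$. To this end, let $\{\sigma_m^2\}_{m=1}^\infty$ be a sequence in $\mathbb{R}_+^{r+1}$ that converges to $\sigma^2 \in \mathbb{R}_+^{r+1}$. Then

$$\begin{aligned}
\liminf_{m \to \infty} P_1(\sigma_m^2, O) &= \liminf_{m \to \infty} \int_O k_1(\tilde\sigma^2 | \sigma_m^2) \, d\tilde\sigma^2 \\
&= \liminf_{m \to \infty} \int_O \Big[ \int_{\mathbb{R}^{p+q}} \pi(\tilde\sigma^2 | \theta, y) \, \pi(\theta | \sigma_m^2, y) \, d\theta \Big] d\tilde\sigma^2 \\
&\ge \int_O \int_{\mathbb{R}^{p+q}} \pi(\tilde\sigma^2 | \theta, y) \, \liminf_{m \to \infty} \pi(\theta | \sigma_m^2, y) \, d\theta \, d\tilde\sigma^2 \\
&= \int_O \Big[ \int_{\mathbb{R}^{p+q}} \pi(\tilde\sigma^2 | \theta, y) \, \pi(\theta | \sigma^2, y) \, d\theta \Big] d\tilde\sigma^2 \\
&= P_1(\sigma^2, O) ,
\end{aligned}$$

where the inequality follows from Fatou's Lemma, and the third equality follows from the fact that
$\pi(\theta | \sigma^2, y)$ is continuous in $\sigma^2$. (For a proof of continuity, see Román (2012).) We conclude that
$P_1(\cdot, O)$ is lower semi-continuous, so the $\sigma^2$-chain is Feller. The proof is now complete. □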

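We end this section with a proof of Corollary 1.

Proof of Corollary 1. It suffices to show that, together, conditions (B') and (C') of Corollary 1
imply the second condition of Theorem 2. Clearly, (B') and (C') imply that $\tilde{s}/2 > 1$, so $(0, 1] \cap
(0, \tilde{s}/2) = (0, 1]$. Take $s^* = 1$. Condition (C') implies

$$2^{-s^*} (p + t)^{s^*} \, \frac{\Gamma\big(\frac{N}{2} + a_e - s^*\big)}{\Gamma\big(\frac{N}{2} + a_e\big)} = \frac{p + t}{N + 2a_e - 2} < 1 .$$

Now, since $\mathrm{tr}(H)$ is the number of zero eigenvalues of $Z^T P^\perp Z$, we have

$$\sum_{i=1}^r \mathrm{tr}\big(R_i \Gamma^T H \Gamma R_i^T\big) = \mathrm{tr}\Big( \Gamma^T H \Gamma \sum_{i=1}^r R_i^T R_i \Big) = \mathrm{tr}\big(\Gamma^T H \Gamma\big) = \mathrm{tr}(H) = q - t .$$

Hence,

$$\begin{aligned}
2^{-s^*} \sum_{i=1}^r \frac{\Gamma\big(\frac{q_i}{2} + a_i - s^*\big)}{\Gamma\big(\frac{q_i}{2} + a_i\big)} \Big( \mathrm{tr}\big(R_i \Gamma^T H \Gamma R_i^T\big) \Big)^{s^*}
&= \sum_{i=1}^r \frac{\mathrm{tr}\big(R_i \Gamma^T H \Gamma R_i^T\big)}{q_i + 2a_i - 2} \\
&\le \frac{\sum_{i=1}^r \mathrm{tr}\big(R_i \Gamma^T H \Gamma R_i^T\big)}{\min_{j \in \{1, 2, \dots, r\}} \{q_j + 2a_j - 2\}} \\
&= \frac{q - t}{\min_{j \in \{1, 2, \dots, r\}} \{q_j + 2a_j - 2\}} < 1 ,
\end{aligned}$$

where the last inequality follows from condition (B'). □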



4 Discussion 



Our decision to work with the $\sigma^2$-chain rather than the $\theta$-chain was based on an important technical
difference between the two chains that stems from the fact that $\pi(\sigma^2 | \theta, y)$ is not continuous in $\theta$ for
each fixed $\sigma^2$. Indeed, recall that

$$\pi(\sigma^2 | \theta, y) = \begin{cases} f_{IG}\Big(\sigma_e^2; \frac{N}{2} + a_e, \, b_e + \frac{\|y - W\theta\|^2}{2}\Big) \prod_{i=1}^r f_{IG}\Big(\sigma_{u_i}^2; \frac{q_i}{2} + a_i, \, b_i + \frac{\|u_i\|^2}{2}\Big) & \text{if } \theta \notin \mathcal{N} , \\ f_{IG}(\sigma_e^2; 1, 1) \prod_{i=1}^r f_{IG}(\sigma_{u_i}^2; 1, 1) & \text{if } \theta \in \mathcal{N} . \end{cases}$$

Also, recall that the Mtd of the $\sigma^2$-chain is given by

$$k_1(\tilde\sigma^2 | \sigma^2) = \int_{\mathbb{R}^{p+q}} \pi(\tilde\sigma^2 | \theta, y) \, \pi(\theta | \sigma^2, y) \, d\theta .$$

Since the set $\mathcal{N}$ has measure zero, the "arbitrary part" of $\pi(\sigma^2 | \theta, y)$ washes out of $k_1$. However, the
same cannot be said for the $\theta$-chain, whose Mtd is given by

$$k_2(\tilde\theta | \theta) = \int_{\mathbb{R}_+^{r+1}} \pi(\tilde\theta | \sigma^2, y) \, \pi(\sigma^2 | \theta, y) \, d\sigma^2 .$$

This difference between $k_1$ and $k_2$ comes into play when we attempt to apply certain "topological"
results from Markov chain theory, such as those in Chapter 6 of Meyn and Tweedie (1993). In
particular, in our proof that the $\sigma^2$-chain is a Feller chain (which was part of the proof of Proposition 1),
we used the fact that $\pi(\theta | \sigma^2, y)$ is continuous in $\sigma^2$ for each fixed $\theta$. Since $\pi(\sigma^2 | \theta, y)$ is not
continuous, we cannot use the same argument to prove that the $\theta$-chain is Feller. In fact, we suspect that
the $\theta$-chain is not Feller, and if this is true, it means that our method of establishing the applicability
of Meyn and Tweedie's (1993) Lemma 15.2.8 will not work for the $\theta$-chain.



It is possible to circumvent the problem described above by removing the set $\mathcal{N}$ from the state
space of the $\theta$-chain. In this case, we are no longer required to define $\pi(\sigma^2 | \theta, y)$ for $\theta \in \mathcal{N}$, and
since $\pi(\sigma^2 | \theta, y)$ is continuous (for fixed $\sigma^2$) on $\mathbb{R}^{p+q} \setminus \mathcal{N}$, the Feller argument for the $\theta$-chain will
go through. On the other hand, the new state space has "holes" in it, and this could complicate the
search for a drift function that is unbounded off compact sets. For example, consider a toy drift
function given by $v(x) = x^2$. This function is clearly unbounded off compact sets when the state
space is $\mathbb{R}$, but not when the state space is $\mathbb{R} \setminus \{0\}$. The modified drift function $v^*(x) = x^2 + 1/x^2$
is unbounded off compact sets for the "holey" state space.
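
The toy example can be verified directly. Assuming the modified drift function is $v^*(x) = x^2 + 1/x^2$, the sketch below (with the hypothetical helper name `sublevel_bounds_modified`) solves $w + 1/w \le d$ for $w = x^2$ and returns the positive half of the sublevel set, which stays a fixed distance away from the hole at 0.

```python
import math


def sublevel_bounds_modified(d):
    """Return (lo, hi) with {x > 0 : x**2 + x**-2 <= d} = [lo, hi], or None if empty."""
    if d < 2:  # x**2 + x**-2 >= 2 for every x != 0
        return None
    root = math.sqrt(d * d - 4.0)
    # w = x**2 must satisfy w**2 - d * w + 1 <= 0
    lo = math.sqrt((d - root) / 2.0)
    hi = math.sqrt((d + root) / 2.0)
    return lo, hi


lo, hi = sublevel_bounds_modified(10.0)
```

By contrast, the sublevel set $\{x \ne 0 : x^2 \le d\}$ contains points arbitrarily close to 0, so it is not compact in $\mathbb{R} \setminus \{0\}$.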



Tan and Hobert (2009) overlooked a set of measure zero (similar to our $\mathcal{N}$), and this oversight
led to an error in the proof of their Proposition 3. As we now explain, their proof can be repaired
using the ideas described in the previous paragraph. Our work shows that their Proposition 3 is
actually correct as stated. Recall that Tan and Hobert (2009) (hereafter, T&H) considered the centered
version of the one-way random effects model, which, in their notation, is

$$y_{ij} = \theta_i + e_{ij} ,$$



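where $i = 1, \dots, q$, $j = 1, \dots, m_i$, the $\theta_i$s are iid $\mathrm{N}(\mu, \sigma_\theta^2)$ and the $e_{ij}$s, which are independent of
the $\theta_i$s, are iid $\mathrm{N}(0, \sigma_e^2)$. They considered a parametric family of improper prior densities given by

$$(\sigma_\theta^2)^{-(a+1)} (\sigma_e^2)^{-(b+1)} ,$$

where $a$ and $b$ are known hyper-parameters. Let $\sigma^2 = (\sigma_\theta^2, \sigma_e^2)$ and $\xi = (\mu, \theta_1, \dots, \theta_q)$. T&H
analyzed the $\xi$-chain, $\{\xi_n\}_{n=0}^\infty$, which was defined to have state space $\mathbb{R}^{q+1}$, and Mtd

$$k(\tilde\xi | \xi) = \int_{\mathbb{R}_+^2} \pi(\tilde\xi | \sigma^2, y) \, \pi(\sigma^2 | \xi, y) \, d\sigma^2 .$$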



The conditional density $\pi(\sigma^2 | \xi, y)$ is the product of two inverted gamma densities, one of which is
the source of the problem. Indeed, $\pi(\sigma_\theta^2 | \xi, y)$ is inverted gamma with shape $q/2 + a$ and scale

$$\frac{1}{2} \sum_{i=1}^q (\theta_i - \mu)^2 .$$

T&H overlooked the fact that this density is not defined on the set

$$\mathcal{N}^* := \Big\{ \xi \in \mathbb{R}^{q+1} : \sum_{i=1}^q (\theta_i - \mu)^2 = 0 \Big\} .$$

Thus, the Mtd $k$ is not well-defined for $\xi \in \mathcal{N}^*$, and, as a result, T&H's argument showing that the
$\xi$-chain is Feller (as a chain on $\mathbb{R}^{q+1}$) is incorrect.

T&H's proof can be repaired by redefining the state space of the $\xi$-chain to be $\mathbb{R}^{q+1} \setminus \mathcal{N}^*$. For
fixed $\sigma^2$, $\pi(\sigma^2 | \xi, y)$ is a continuous function of $\xi$ on $\mathbb{R}^{q+1} \setminus \mathcal{N}^*$, and it follows that the $\xi$-chain is
Feller on the new state space. Now, the drift function that is used in T&H's proof of Proposition 3
takes the form





s 
+ 







i=l 



where $\epsilon > 0$ and $s \in (0, 1]$. This function is unbounded off compact sets when the state space is
$\mathbb{R}^{q+1}$, but not when the state space is $\mathbb{R}^{q+1} \setminus \mathcal{N}^*$. To remedy this problem, we add the following
term to the drift function



i=l 



where c > 0. Since this function blows up as ^ approaches J\f*, the modified drift function is 
unbounded off compact sets on the new state space, R'^+^ \ M*. Straightforward calculations (using 
techniques similar to those employed in our Appendix IA.4I ) show that the modified drift function 
still satisfies a geometric drift condition, which implies that the £-c hain defined on R'^ +^ \ A/"* 
is geometrically ergodic. Moreover, this result, in conjunction with iMeyn and Tweedid 's (119931) 



Theorem 15.0.1, implies that the original ^-chain (defined on M'^+^) is also geometric. Thus, T&H's 
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result is correct as stated. For a more detailed v ersion of the corrected proof, as well as an alternate 
proof based on the cj^-chain, see lRomanI (120 12h . 
Appendices

A Upper Bounds

A.1 Preliminary results

Here is our first result. 

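Lemma 1. The following inequalities hold for all $\sigma^2 \in \mathbb{R}_+^{r+1}$ and all $i \in \{1, 2, \ldots, r\}$:

1. $Q^{-1} \preceq (Z^T P^\perp Z)^+ \sigma_e^2 + \Gamma^T H \Gamma \sum_{j=1}^r \sigma_{u_j}^2$,

2. $\mathrm{tr}(P^\perp Z Q^{-1} Z^T P^\perp) \le \mathrm{rank}(Z^T P^\perp Z)\, \sigma_e^2$,

3. $(R_i Q^{-1} R_i^T)^{-1} \preceq \big( (\sigma_e^2)^{-1} \lambda_{\max} + (\sigma_{u_i}^2)^{-1} \big) I_{q_i}$.

Proof. Recall from Section 3 that $\Gamma^T \Lambda \Gamma$ is the spectral decomposition of $Z^T P^\perp Z$, and that $H$ is a binary diagonal matrix whose $i$th diagonal element is 1 if and only if the $i$th diagonal element of $\Lambda$ is 0. Let $\sigma_\bullet^2 = \sum_{j=1}^r \sigma_{u_j}^2$. Since $(\sigma_\bullet^2)^{-1} I_q \preceq D^{-1}$, we have

$$(\sigma_e^2)^{-1} Z^T P^\perp Z + (\sigma_\bullet^2)^{-1} I_q \preceq (\sigma_e^2)^{-1} Z^T P^\perp Z + D^{-1} = Q ,$$

and this yields

$$Q^{-1} \preceq \big( (\sigma_e^2)^{-1} Z^T P^\perp Z + (\sigma_\bullet^2)^{-1} I_q \big)^{-1} = \Gamma^T \big( \Lambda (\sigma_e^2)^{-1} + I_q (\sigma_\bullet^2)^{-1} \big)^{-1} \Gamma . \tag{14}$$

Now let $\Lambda^+$ be a diagonal matrix whose diagonal elements, $\{\lambda_i^+\}_{i=1}^q$, are given by

$$\lambda_i^+ = \begin{cases} \lambda_i^{-1} & \text{if } \lambda_i \ne 0 , \\ 0 & \text{if } \lambda_i = 0 , \end{cases}$$

where $\lambda_i$ is the $i$th diagonal element of $\Lambda$. Note that, for each $i \in \{1, 2, \ldots, q\}$, we have

$$\big( \lambda_i (\sigma_e^2)^{-1} + (\sigma_\bullet^2)^{-1} \big)^{-1} \le \lambda_i^+ \sigma_e^2 + H_{ii}\, \sigma_\bullet^2 ,$$

where $H_{ii}$ denotes the $i$th diagonal element of $H$. This shows that

$$\big( \Lambda (\sigma_e^2)^{-1} + I_q (\sigma_\bullet^2)^{-1} \big)^{-1} \preceq \Lambda^+ \sigma_e^2 + H \sigma_\bullet^2 .$$

Together with (14), this leads to

$$Q^{-1} \preceq \Gamma^T \big( \Lambda (\sigma_e^2)^{-1} + I_q (\sigma_\bullet^2)^{-1} \big)^{-1} \Gamma \preceq \Gamma^T \big( \Lambda^+ \sigma_e^2 + H \sigma_\bullet^2 \big) \Gamma = (Z^T P^\perp Z)^+ \sigma_e^2 + \Gamma^T H \Gamma\, \sigma_\bullet^2 ,$$

which proves the first statement. Now let $\tilde{Z} = P^\perp Z$. Pre- and post-multiplying the first statement by $\tilde{Z}$ and $\tilde{Z}^T$, respectively, and then taking traces yields

$$\mathrm{tr}(\tilde{Z} Q^{-1} \tilde{Z}^T) \le \mathrm{tr}\big(\tilde{Z} (\tilde{Z}^T \tilde{Z})^+ \tilde{Z}^T\big) \sigma_e^2 + \mathrm{tr}\big(\tilde{Z} \Gamma^T H \Gamma \tilde{Z}^T\big) \sigma_\bullet^2 . \tag{15}$$

Since $\tilde{Z}^T \tilde{Z} (\tilde{Z}^T \tilde{Z})^+$ is idempotent, we have

$$\mathrm{tr}\big(\tilde{Z} (\tilde{Z}^T \tilde{Z})^+ \tilde{Z}^T\big) = \mathrm{tr}\big(\tilde{Z}^T \tilde{Z} (\tilde{Z}^T \tilde{Z})^+\big) = \mathrm{rank}\big(\tilde{Z}^T \tilde{Z} (\tilde{Z}^T \tilde{Z})^+\big) = \mathrm{rank}(\tilde{Z}^T \tilde{Z}) .$$

Furthermore,

$$\mathrm{tr}\big(\tilde{Z} \Gamma^T H \Gamma \tilde{Z}^T\big) = \mathrm{tr}\big(\Gamma^T H \Gamma Z^T P^\perp Z\big) = \mathrm{tr}\big(\Gamma^T H \Gamma \Gamma^T \Lambda \Gamma\big) = \mathrm{tr}\big(\Gamma^T H \Lambda \Gamma\big) = 0 ,$$

where the last equality follows from the fact that $H \Lambda = 0$. It follows from (15) that

$$\mathrm{tr}(P^\perp Z Q^{-1} Z^T P^\perp) \le \mathrm{rank}(Z^T P^\perp Z)\, \sigma_e^2 ,$$

and the second statement has been established. Recall from Section 3 that $\lambda_{\max}$ is the largest eigenvalue of $Z^T P^\perp Z$, and that $R_i$ is the $q_i \times q$ matrix of 0s and 1s such that $R_i u = u_i$. Now, fix $i \in \{1, 2, \ldots, r\}$ and note that

$$Q = (\sigma_e^2)^{-1} Z^T P^\perp Z + D^{-1} \preceq (\sigma_e^2)^{-1} \lambda_{\max} I_q + D^{-1} .$$

It follows that

$$R_i \big( (\sigma_e^2)^{-1} \lambda_{\max} I_q + D^{-1} \big)^{-1} R_i^T \preceq R_i Q^{-1} R_i^T ,$$

and since these two matrices are both positive definite, we have

$$(R_i Q^{-1} R_i^T)^{-1} \preceq \Big[ R_i \big( (\sigma_e^2)^{-1} \lambda_{\max} I_q + D^{-1} \big)^{-1} R_i^T \Big]^{-1} = \big( (\sigma_e^2)^{-1} \lambda_{\max} + (\sigma_{u_i}^2)^{-1} \big) I_{q_i} ,$$

and this proves that the third statement is true. □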
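The three inequalities of Lemma 1 can be spot-checked numerically. The following is a minimal sketch, assuming $P^\perp = I - X(X^TX)^{-1}X^T$, $Q = (\sigma_e^2)^{-1} Z^T P^\perp Z + D^{-1}$ and $D = \oplus_{i=1}^r \sigma_{u_i}^2 I_{q_i}$; the design matrices, the tolerance, and the helper name `lemma1_gaps` are illustrative choices, not part of the paper. Each returned gap is "right-hand side minus left-hand side" (smallest eigenvalue for the matrix inequalities), so it should be nonnegative up to rounding.

```python
import numpy as np

def lemma1_gaps(X, Z, q_sizes, sigma_e2, sigma_u2, tol=1e-10):
    """Return (gap1, gap2, gap3) for the three statements of Lemma 1."""
    N, q = Z.shape
    sigma_u2 = np.asarray(sigma_u2, dtype=float)
    P_perp = np.eye(N) - X @ np.linalg.solve(X.T @ X, X.T)
    A = Z.T @ P_perp @ Z
    A = (A + A.T) / 2.0                        # symmetrize against rounding
    lam, vecs = np.linalg.eigh(A)              # A = vecs diag(lam) vecs^T, so Gamma = vecs^T
    is_zero = lam <= tol * max(lam.max(), 1.0)
    lam_plus = np.where(is_zero, 0.0, 1.0 / np.where(is_zero, 1.0, lam))
    A_plus = vecs @ np.diag(lam_plus) @ vecs.T            # Moore-Penrose inverse of A
    GtHG = vecs @ np.diag(is_zero.astype(float)) @ vecs.T # Gamma^T H Gamma
    D_inv = np.diag(np.repeat(1.0 / sigma_u2, q_sizes))
    Q_inv = np.linalg.inv(A / sigma_e2 + D_inv)

    # Statement 1: Q^{-1} <= A^+ sigma_e^2 + Gamma^T H Gamma * sum_j sigma_{u_j}^2
    rhs1 = A_plus * sigma_e2 + GtHG * sigma_u2.sum()
    gap1 = np.linalg.eigvalsh(rhs1 - Q_inv).min()

    # Statement 2: tr(P_perp Z Q^{-1} Z^T P_perp) <= rank(A) sigma_e^2
    rank_A = int((~is_zero).sum())
    gap2 = rank_A * sigma_e2 - np.trace(P_perp @ Z @ Q_inv @ Z.T @ P_perp)

    # Statement 3, for every random factor i
    lam_max = lam.max()
    gap3, start = np.inf, 0
    for i, qi in enumerate(q_sizes):
        R_i = np.eye(q)[start:start + qi, :]   # selects the block u_i from u
        lhs = np.linalg.inv(R_i @ Q_inv @ R_i.T)
        rhs = (lam_max / sigma_e2 + 1.0 / sigma_u2[i]) * np.eye(qi)
        gap3 = min(gap3, np.linalg.eigvalsh(rhs - lhs).min())
        start += qi
    return gap1, gap2, gap3

rng = np.random.default_rng(1)
N = 12
X = np.column_stack([np.ones(N), rng.normal(size=N)])
# First factor: group indicators (rank-deficient after projection); second factor: random columns.
Z = np.column_stack([np.kron(np.eye(3), np.ones((4, 1))), rng.normal(size=(N, 2))])
q_sizes = [3, 2]
settings = [(0.1, [1.0, 1.0]), (1.0, [5.0, 0.2]), (10.0, [0.5, 2.0])]
worst_gap = min(min(lemma1_gaps(X, Z, q_sizes, se2, su2)) for se2, su2 in settings)
```

The indicator columns sum to the intercept column of `X`, so $Z^T P^\perp Z$ is singular here and the $H$ term in the first statement is actually exercised.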
Lemma 2. The function $h(\sigma^2) := \| (\sigma_e^2)^{-1} Q^{-1} Z^T P^\perp y \|$ is bounded on $\mathbb{R}_+^{r+1}$.

The following result from Khare and Hobert (2011) will be used in the proof of Lemma 2.

Lemma 3. Fix $n \in \{2, 3, \ldots\}$ and $m \in \mathbb{N}$, and let $t_1, \ldots, t_n$ be vectors in $\mathbb{R}^m$. Then

$$C_{m,n}(t_1; t_2, \ldots, t_n) := \sup_{c \in \mathbb{R}_+^n} t_1^T \Big( t_1 t_1^T + \sum_{j=2}^n c_j t_j t_j^T + c_1 I \Big)^{-2} t_1$$

is finite.

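Proof of Lemma 2. Let $\tilde{z}_i$ and $y_i$ denote the $i$th column of $\tilde{Z}^T = (P^\perp Z)^T$ and the $i$th component of $y$, respectively. Then,

$$\begin{aligned}
h(\sigma^2) &= \big\| (Z^T P^\perp Z + \sigma_e^2 D^{-1})^{-1} Z^T P^\perp y \big\| \\
&= \Big\| (\tilde{Z}^T \tilde{Z} + \sigma_e^2 D^{-1})^{-1} \sum_{i=1}^N \tilde{z}_i y_i \Big\| \\
&\le \sum_{i=1}^N \big\| (\tilde{Z}^T \tilde{Z} + \sigma_e^2 D^{-1})^{-1} \tilde{z}_i y_i \big\| \\
&= \sum_{i=1}^N |y_i| \, \Big\| \Big( \tilde{z}_i \tilde{z}_i^T + \sum_{j \in \{1, 2, \ldots, N\} \setminus \{i\}} \tilde{z}_j \tilde{z}_j^T + \sigma_e^2 D^{-1} \Big)^{-1} \tilde{z}_i \Big\| .
\end{aligned}$$

Therefore, it is enough to show that for each $i \in \{1, 2, \ldots, N\}$,

$$K_i(\sigma^2) := \Big\| \Big( \tilde{z}_i \tilde{z}_i^T + \sum_{j \in \{1, 2, \ldots, N\} \setminus \{i\}} \tilde{z}_j \tilde{z}_j^T + \sigma_e^2 D^{-1} \Big)^{-1} \tilde{z}_i \Big\|$$

is bounded. Now,

$$\begin{aligned}
K_i^2(\sigma^2) &= \tilde{z}_i^T \Big( \tilde{z}_i \tilde{z}_i^T + \sum_{j \in \{1, 2, \ldots, N\} \setminus \{i\}} \tilde{z}_j \tilde{z}_j^T + \sigma_e^2 D^{-1} \Big)^{-2} \tilde{z}_i \\
&\le \sup_{c \in \mathbb{R}_+^{N+q}} t_i^T \Big( t_i t_i^T + \sum_{j \in \{1, 2, \ldots, N\} \setminus \{i\}} c_j t_j t_j^T + \sum_{j=N+1}^{N+q} c_j t_j t_j^T + c_i I_q \Big)^{-2} t_i ,
\end{aligned}$$

where, for $j = 1, 2, \ldots, N$, $t_j = \tilde{z}_j$, and for $j \in \{N+1, \ldots, N+q\}$, the $t_j$ are the standard orthonormal basis vectors in $\mathbb{R}^q$; that is, $t_{N+l}$ has a one in the $l$th position and zeros everywhere else. The inequality holds because $\sigma_e^2 D^{-1}$ is a diagonal matrix with strictly positive diagonal elements, so it can be written as $\sum_{j=N+1}^{N+q} c_j t_j t_j^T + c_i I_q$ with all coefficients positive (for instance, by taking $c_i$ to be half of the smallest diagonal element). An application of Lemma 3 shows that $K_i(\sigma^2)$ is bounded. Hence, $h(\sigma^2)$ is bounded. □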
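Let $\chi^2_k(w)$ denote the non-central chi-square distribution with $k$ degrees of freedom and non-centrality parameter $w$.

Lemma 4. If $U \sim \chi^2_k(w)$ and $\gamma \in (0, k/2)$, then

$$E\big[U^{-\gamma}\big] \le 2^{-\gamma}\, \frac{\Gamma(\frac{k}{2} - \gamma)}{\Gamma(\frac{k}{2})} .$$

Proof. Since $\Gamma(x - \gamma)/\Gamma(x)$ is decreasing for $x > \gamma > 0$, we have

$$\begin{aligned}
E\big[U^{-\gamma}\big] &= \sum_{i=0}^\infty p_i(w) \int_0^\infty u^{-\gamma}\, \frac{1}{\Gamma(\frac{k}{2} + i)\, 2^{\frac{k}{2} + i}}\, u^{\frac{k}{2} + i - 1} e^{-u/2} \, du \\
&= 2^{-\gamma} \sum_{i=0}^\infty p_i(w)\, \frac{\Gamma(\frac{k}{2} + i - \gamma)}{\Gamma(\frac{k}{2} + i)} \\
&\le 2^{-\gamma}\, \frac{\Gamma(\frac{k}{2} - \gamma)}{\Gamma(\frac{k}{2})} ,
\end{aligned}$$

where $p_i(w)$, $i = 0, 1, \ldots$, are the Poisson weights in the mixture representation of $\chi^2_k(w)$, which sum to one.

□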
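The bound in Lemma 4 can be checked numerically by summing the Poisson mixture series. The sketch below assumes the common convention in which the Poisson mixing mean equals $w/2$; the bound itself does not depend on this choice, since it only uses the fact that the weights sum to one. The function names and the truncation level are illustrative.

```python
import math

def neg_moment_noncentral_chisq(k, w, gamma, terms=400):
    """Approximate E[U^{-gamma}] for U ~ noncentral chi-square(k, w).

    Uses the Poisson mixture of central chi-squares with k + 2i degrees of
    freedom, with an ASSUMED mixing mean of w / 2, truncated after `terms` terms.
    """
    if not 0.0 < gamma < k / 2.0:
        raise ValueError("need 0 < gamma < k/2")
    mean = w / 2.0
    total = 0.0
    for i in range(terms):
        if mean > 0.0:
            log_p = -mean + i * math.log(mean) - math.lgamma(i + 1)
        else:
            log_p = 0.0 if i == 0 else -math.inf   # all mass on i = 0 when w = 0
        log_ratio = math.lgamma(k / 2.0 + i - gamma) - math.lgamma(k / 2.0 + i)
        total += math.exp(log_p + log_ratio)
    return 2.0 ** (-gamma) * total

def lemma4_bound(k, gamma):
    """Right-hand side of Lemma 4: 2^{-gamma} Gamma(k/2 - gamma) / Gamma(k/2)."""
    return 2.0 ** (-gamma) * math.exp(math.lgamma(k / 2.0 - gamma) - math.lgamma(k / 2.0))

# Compare the series value with the bound for several non-centrality parameters.
checks = [(w, neg_moment_noncentral_chisq(3, w, 0.4), lemma4_bound(3, 0.4))
          for w in (0.0, 0.5, 5.0, 20.0)]
```

In the central case $w = 0$ the series reduces to its first term, so the moment and the bound coincide; for $w > 0$ the moment is strictly smaller.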



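A.2 An upper bound on $E\big[\|y - W\theta\|^2 \mid \sigma^2\big]$

Recall that $\theta = (\beta^T\ u^T)^T$, $W = (X\ Z)$, and that $\pi(\theta \mid \sigma^2, y)$ is a multivariate normal density with mean $m$ and covariance matrix $V$. Thus,

$$E\big[\|y - W\theta\|^2 \mid \sigma^2\big] = \mathrm{tr}(W V W^T) + \|y - Wm\|^2 , \tag{16}$$

and we have

$$\begin{aligned}
\mathrm{tr}(W V W^T) &= \sigma_e^2\, \mathrm{tr}(P) + \mathrm{tr}(P Z Q^{-1} Z^T P) - 2\, \mathrm{tr}(Z Q^{-1} Z^T P) + \mathrm{tr}(Z Q^{-1} Z^T) \\
&= p \sigma_e^2 - \mathrm{tr}(Z Q^{-1} Z^T P) + \mathrm{tr}(Z Q^{-1} Z^T) \\
&= p \sigma_e^2 + \mathrm{tr}(Z Q^{-1} Z^T P^\perp) \\
&= p \sigma_e^2 + \mathrm{tr}(P^\perp Z Q^{-1} Z^T P^\perp) \\
&\le p \sigma_e^2 + \mathrm{rank}(Z^T P^\perp Z)\, \sigma_e^2 \\
&= (p + t)\, \sigma_e^2 ,
\end{aligned} \tag{17}$$

where the inequality is an application of Lemma 1. Next, a simple calculation shows that

$$y - Wm = P^\perp \big[ I - (\sigma_e^2)^{-1} Z Q^{-1} Z^T P^\perp \big] y .$$

Hence,

$$\begin{aligned}
\|y - Wm\| &= \big\| P^\perp y - (\sigma_e^2)^{-1} P^\perp Z Q^{-1} Z^T P^\perp y \big\| \\
&\le \|P^\perp y\| + \big\| (\sigma_e^2)^{-1} P^\perp Z Q^{-1} Z^T P^\perp y \big\| \\
&\le \|P^\perp y\| + \|P^\perp Z\| \, \big\| (\sigma_e^2)^{-1} Q^{-1} Z^T P^\perp y \big\| \\
&\le \text{const} ,
\end{aligned} \tag{18}$$

where $\|\cdot\|$ denotes the Frobenius norm and the last inequality uses Lemma 2. Finally, combining (16), (17) and (18) yields

$$E\big[\|y - W\theta\|^2 \mid \sigma^2\big] \le (p + t)\, \sigma_e^2 + \text{const} .$$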
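The trace identity in (17) and the expression for $y - Wm$ can be verified numerically. The sketch below rests on an assumption about $m$ and $V$, which are defined in the main text rather than in this appendix: it takes the posterior precision of $\theta$ to be $W^TW/\sigma_e^2$ plus a block-diagonal matrix with a zero block for $\beta$ (flat prior) and $D^{-1}$ for $u$, with $m = V W^T y / \sigma_e^2$. The data and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 12, 2
q_sizes = [3, 2]                      # q_1, q_2 for r = 2 random factors
q = sum(q_sizes)
X = np.column_stack([np.ones(N), rng.normal(size=N)])
Z = np.column_stack([np.kron(np.eye(3), np.ones((4, 1))), rng.normal(size=(N, 2))])
y = rng.normal(size=N)
sigma_e2 = 0.7
sigma_u2 = np.array([1.5, 0.3])

W = np.hstack([X, Z])
D_inv = np.diag(np.repeat(1.0 / sigma_u2, q_sizes))

# ASSUMED form of the conditional posterior of theta = (beta, u) given sigma^2 and y.
prior_precision = np.zeros((p + q, p + q))
prior_precision[p:, p:] = D_inv
V = np.linalg.inv(W.T @ W / sigma_e2 + prior_precision)
m = V @ W.T @ y / sigma_e2

P = X @ np.linalg.solve(X.T @ X, X.T)
P_perp = np.eye(N) - P
A = Z.T @ P_perp @ Z
Q_inv = np.linalg.inv(A / sigma_e2 + D_inv)

# Left side of (17) computed directly, and the fourth line of (17).
trace_direct = np.trace(W @ V @ W.T)
trace_formula = p * sigma_e2 + np.trace(P_perp @ Z @ Q_inv @ Z.T @ P_perp)

# Final bound in (17), with t = rank(Z^T P_perp Z).
t = np.linalg.matrix_rank(A)
trace_bound = (p + t) * sigma_e2

# Residual y - W m computed directly and via the closed form preceding (18).
resid_direct = y - W @ m
resid_formula = P_perp @ (y - Z @ Q_inv @ Z.T @ P_perp @ y / sigma_e2)
```

Under the stated assumption, `trace_direct` and `trace_formula` agree to rounding error, both are at most `trace_bound`, and the two residual vectors coincide.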

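A.3 An upper bound on $E\big[\|u_i\|^2 \mid \sigma^2\big]$

Note that

$$E\big[\|u_i\|^2 \mid \sigma^2\big] = E\big[\|R_i u\|^2 \mid \sigma^2\big] = \mathrm{tr}(R_i Q^{-1} R_i^T) + \big\| E[R_i u \mid \sigma^2] \big\|^2 . \tag{19}$$

By Lemma 1, we have

$$\mathrm{tr}(R_i Q^{-1} R_i^T) \le \mathrm{tr}\big(R_i (Z^T P^\perp Z)^+ R_i^T\big) \sigma_e^2 + \mathrm{tr}\big(R_i \Gamma^T H \Gamma R_i^T\big) \sum_{j=1}^r \sigma_{u_j}^2 . \tag{20}$$

Now, by Lemma 2,

$$\big\| E[R_i u \mid \sigma^2] \big\| \le \|R_i\| \, \big\| E[u \mid \sigma^2] \big\| = \|R_i\| \, h(\sigma^2) \le \text{const} . \tag{21}$$

Combining (19), (20) and (21) yields

$$E\big[\|u_i\|^2 \mid \sigma^2\big] \le \mathrm{tr}\big(R_i (Z^T P^\perp Z)^+ R_i^T\big) \sigma_e^2 + \mathrm{tr}\big(R_i \Gamma^T H \Gamma R_i^T\big) \sum_{j=1}^r \sigma_{u_j}^2 + \text{const} .$$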

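A.4 An upper bound on $E\big[(\|u_i\|^2)^{-c} \mid \sigma^2\big]$

Fix $i \in \{1, 2, \ldots, r\}$. Given $\sigma^2$, $(R_i Q^{-1} R_i^T)^{-1/2} u_i$ has a multivariate normal distribution with identity covariance matrix. It follows that, conditional on $\sigma^2$, the distribution of $u_i^T (R_i Q^{-1} R_i^T)^{-1} u_i$ is $\chi^2_{q_i}(w)$. It follows from Lemma 4 that, as long as $c \in (0, 1/2)$, we have

$$E\Big[ \big( u_i^T (R_i Q^{-1} R_i^T)^{-1} u_i \big)^{-c} \,\Big|\, \sigma^2 \Big] \le 2^{-c}\, \frac{\Gamma(\frac{q_i}{2} - c)}{\Gamma(\frac{q_i}{2})} .$$

Now, by Lemma 1,

$$\begin{aligned}
E\big[ (\|u_i\|^2)^{-c} \mid \sigma^2 \big] &= \big( (\sigma_e^2)^{-1} \lambda_{\max} + (\sigma_{u_i}^2)^{-1} \big)^{c}\, E\Big[ \Big( \big( (\sigma_e^2)^{-1} \lambda_{\max} + (\sigma_{u_i}^2)^{-1} \big) u_i^T u_i \Big)^{-c} \,\Big|\, \sigma^2 \Big] \\
&\le \big( (\sigma_e^2)^{-1} \lambda_{\max} + (\sigma_{u_i}^2)^{-1} \big)^{c}\, E\Big[ \big( u_i^T (R_i Q^{-1} R_i^T)^{-1} u_i \big)^{-c} \,\Big|\, \sigma^2 \Big] \\
&\le 2^{-c}\, \frac{\Gamma(\frac{q_i}{2} - c)}{\Gamma(\frac{q_i}{2})}\, \big( \lambda_{\max} (\sigma_e^2)^{-1} + (\sigma_{u_i}^2)^{-1} \big)^{c} \\
&\le 2^{-c}\, \frac{\Gamma(\frac{q_i}{2} - c)}{\Gamma(\frac{q_i}{2})}\, \big( \lambda_{\max}^{c} (\sigma_e^2)^{-c} + (\sigma_{u_i}^2)^{-c} \big) ,
\end{aligned}$$

where the last inequality uses the fact that $(a + b)^c \le a^c + b^c$ for $a, b \ge 0$ and $c \in (0, 1)$.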